ABSTRACT


The article addresses the preparation and verification of mathematical
models of computing systems with resource virtualization. The objective of this
study is to verify mathematical models of computer systems with
virtualization experimentally by creating a virtual server on the host platform
and monitoring its characteristics under load. Known models cannot be applied
to computing systems with virtualization, because they do not allow a comprehensive
analysis that would determine the most effective initial allocation
of resources and its optimization for a specific field
and task of use. The study uses closed queueing networks.
Simple models for the analysis of various structures of computer systems
are obtained experimentally. To make the models adaptive,
triggers are used that monitor and adjust the power
of the processing channel in individual queueing systems, depending on
the specified conditions. Experiments show that the obtained results are reliable
and usable as a flexible tool for studying virtualization properties when
structuring computing systems. This knowledge could be of use to businesses
interested in optimizing the server configuration of their IT infrastructure.

1. INTRODUCTION

Data storage and processing infrastructure is one of the crucial components of corporate IT systems;
its effectiveness is fundamental to business performance in a dynamic and competitive market,
which is why today's computing systems (CS) and data storage systems must meet stringent requirements.
They must adapt to rapidly changing tasks and objectives; guarantee the required
application performance; scale, with an option to add resources in service;
minimize downtime due to failures or maintenance; and be easy to use and maintain. The most efficient way
to meet such requirements is to use virtualized CS; at the operating system (OS) level, this technology consumes up
to 20% of the server CPU capacity. However, this research aims at describing the creation of efficient CS
models, which is why such models shall be based on natural virtualization that uses more efficient
software/hardware-level mechanisms. Based on this, the following problems are relevant today: creating
models of virtualized CS; and implementing the models and testing their application feasibility, e.g. using virtual servers
to test and develop software, to set up a remote office, or to rent out as a basis for outsourcing
in computing, etc.


2. METHODOLOGY 

This is generally an exploratory paper. While studying the subject matter, the authors
analyzed a body of literature [1-10] to find uncovered or unresolved issues, such as using virtual servers to
study and verify models of virtualized systems. Some issues relating to the possibility of creating
and verifying a mathematical virtualized-CS model are not properly covered in papers; however, [11-14]
address the most problematic issues in part. The goal of this paper is to analyze the existing CS that use natural
virtualization, to describe how models or their implementations (virtual servers) could be used, to study them,
and to obtain the results of using a virtual server to verify a virtualized-CS model; the characteristics and
models of such a server are detailed in [15]. Another goal is to develop a method for making virtualization-based
models of adaptive CS. These issues are relevant today in view of global computerization and the nearly universal
use of big (and varied) data and virtual servers. To attain these goals, the following must be addressed: describe
virtualized-CS models for different classes of tasks and prepare their source data; adapt the mathematics
behind the queueing theory to computing virtualized-CS models, i.e. to verifying such models by means of
virtual servers; develop a method for constructing and evaluating adaptive-system models. The research
methods used are based on the queueing theory as well as on mathematical statistics and experimental
model verification.

Virtualization means a variety of methods for abstracting from various physical computational 
resources (CR). Virtualization tools can represent a single physical resource as a set of separate logically 
independent resources (logical servers) to isolate applications from each other; conversely, virtualization can 
combine separate physical resources within a heterogeneous structure, be it servers or drives, into a single 
logical resource. CPU virtualization is possible in theory as substantiated by the Church-Turing thesis [16-19].
The thesis is essentially about computer simulation of a Turing machine (an abstract computing machine), 
which is assumed possible; the assumption means that as tools for handling algorithmic problems, 
all computers are equivalent regardless of their implementation. The thesis is not a proven theorem; 
nevertheless, it suggests that any computing environment can be simulated by another such environment. 
Important theoretical research into CPU virtualization was carried out by Gerald Popek and Robert Goldberg 
in the form of three virtualization requirements [1]: equivalence; resource management; efficiency. 

Server efficiency is very low, especially in the case of x86 servers; its commonly recognized level 
is about 5% to 15%. Such efficiency largely depends on the coherence of CPU and server architecture with 
the operating system. If that coherence is good, as is the case of the RISC/Unix combination, server efficiency 
may reach 25-30% or above [19-25]. Virtualization can raise this figure to above 85% while improving 
the reliability, scalability, and other characteristics critical for data centers; besides, it helps save the costs 
of hardware, support, and administration. The existing models are not applicable to virtualized CS as they 
cannot run comprehensive analysis to find the most efficient way of initial resource distribution and to optimize 
such distribution for a particular application while the CS is running. Simulation models are common; they
simulate the behavior of a real system by introducing special conditions and lags that configure the sequence
in which the system components transition from one state to another. One important advantage a simulation
model has over analytical models is that it can potentially be brought even closer to
the simulated object by adding further detail. However, it should be borne in mind that complex
simulation models require substantial CR to run, which means that such models are only advisable if analytical
methods are not suitable.

Literature review shows that Russian and international researchers mostly use less resource-intensive analytical
methods suitable for parametric analysis and optimization. The time characteristics of systems can be assessed
by the queueing theory. To produce the estimated ratios that comprise mathematical models, analytical methods
require constraints and assumptions that limit their applicability. Thus, the models proposed by L. Kleinrock
and M. Schwartz [21, 22] consider a message-switching communication network consisting of M channels
and N switching nodes. The mathematical model uses the following assumptions: all channels and all
switching nodes are noiseless and absolutely reliable; switching-node processing time is zero; the transmitting
end of a channel can queue messages in an unlimited memory; the traffic the communication network receives
from external sources (e.g. from host machines) forms a Poisson process; for many analytical relations,
the exclusive path is known for each transmitter-receiver pair; for some problems, the probability $p(j,k)$
of transition from the jth node to the kth node is introduced; message lengths are independent and
exponentially distributed. These constraints and assumptions can be used to find the time $t_i$ a message stays in
the network, the communication channel load factors r(r,v), and the queue lengths $l_i$. They also help address
the issues of efficient design. In [21], L. Kleinrock focuses on three problems: configuring the channel
throughput; configuring the distribution of streams in channels; and selecting the network topology. These
are single-attribute problems that minimize the mean messaging latency in the communication network
while keeping the costs within the required limits. Methods developed and summarized in [23] consider
packet-switching networks that are studied as bipolar multiphase queueing systems. These methods use
the following assumptions and constraints: the distribution of any random variable is assumed exponential
except in the third queueing phase, where the service time distribution is deemed regular; the specific subscriber
load at subscriber terminals and computers is deemed uniformly distributed network-side; the message queue
discipline is FIFO; the time to establish a logical connection is included in the switching time; the queueing
system is non-priority; while transmitted over the network, messages age at the specified rate. The basic criteria
for evaluating a data transmission network are usually the probability that a message will be delivered in time
and the mean delivery time.

Despite the well-elaborated nature of the existing approaches, some of which have evolved into
engineering methods, these models have one significant disadvantage: they cannot comprehensively consider
both the intra-model information flows and changes in the components of the model itself due to random factors
such as hardware failures or CS reconfigurations, which are typical for naturally virtualized systems. Consider
the mathematical basis, i.e. the calculation method based on closed queueing networks (CQN) [6-9, 24, 25], chosen
because such networks are used to represent processes occurring in CS with a limited number of requests;
the limit is inherent to the system, in this case the limited number of CPUs available to the CS.
For instance, multiprocessor systems (MPS) can only connect a limited number of CPUs to a shared bus.
In the case of virtualization, we have a pool of CPUs, RAM, and input-output adapters.

On the other hand, one example of a QN is the simplest multi-program computer, where a finite
number of programs corresponding to the multiprogramming level turn to one of the M CPUs, i.e. are
processed by the M CPUs with some probability. The mathematics behind the CQN is analyzed
and summarized below. When running, each of the M CPUs requests the hypervisor to grant access to
a resource, i.e. RAM, an external storage (ES), or an I/O adapter. While the hypervisor grants such access to
a CPU, others process data from their caches or local RAM, or wait for a similar access permission; as such,
they do not generate new queries to the hypervisor until the running CPU frees the resource requested by
another CPU; in the case of time-sharing, such suspension lasts until the running CPU’s allocated cycle is over.
Figure 1 shows the general information-flow model of a virtual server with a limited number of requests.

In a closed model, requests come from the system $S_0$, which contains M channels that represent
the CPUs as they function in an MPS. The parameter D of the model equals the mean time the CPU spends
to analyze the results of processing its preceding request [12-14].

Figure 1. Information-flow model of a virtualization server
In this model, the number of circulating requests equals the number of channels in the system $S_0$, hence
there is no queue in $S_0$. The rate $\lambda_0$ at which requests come from the system $S_0$ to the other
systems $S_1, \ldots, S_n$ depends on the number of requests in the system $S_0$:

$$\lambda_0 = \frac{1}{D}\left(M - \sum_{j=1}^{n} M_j\right) \qquad (1)$$

where $M_j$ is the number of requests in the jth queueing system (QS) of the network.


Given equal intensity of the input and output request streams of $S_0$, the rate $\lambda_0$ determines
the performance of the simulated MPS, i.e. the mean number of CPU queries processed by the hypervisor per
unit of time. Consider a CQN with exponentially distributed request processing time in each of the systems
$S_j$ $(j = 1, \ldots, n)$. For each of the network’s systems, define the parameters: $K_j$ is the number of channels;
$\vartheta_j$ is the mean channel-specific request processing time; $\alpha_j$ is the transfer factor. Another known variable
is the number M of circulating requests. The parameters $K_j$, $\vartheta_j$, $\alpha_j$, and M are source data for calculating
the network’s steady state, in particular the probabilities of its states, in terms of which all other characteristics
are given [5].
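
The role of the state probabilities can be illustrated with a short numerical sketch. It assumes the standard product-form (Gordon-Newell) solution for closed exponential networks, which the text does not spell out, and it uses the multiplier that appears later as (11). The function names, the argument layout (the system S_0 listed first, with D as its time parameter) and the demo numbers are illustrative assumptions, not part of the original model.

```python
from itertools import product
from math import factorial


def r_factor(m, k):
    """Multiplier R_j(M_j) for a node with k channels holding m requests."""
    if m <= k:
        return 1.0 / factorial(m)
    return 1.0 / (factorial(k) * k ** (m - k))


def state_probabilities(total, channels, service_times, transfer):
    """Return {state: probability} for a closed network in product form.

    total         -- number M of circulating requests
    channels      -- integer number of channels of every node, S_0 first
    service_times -- mean channel-specific processing times (D for S_0)
    transfer      -- transfer factors (1.0 for S_0)
    Assumption: the product-form solution holds (exponential service).
    """
    n = len(channels)
    weights = {}
    # Brute-force enumeration of all states whose requests sum to `total`.
    for state in product(range(total + 1), repeat=n):
        if sum(state) != total:
            continue
        w = 1.0
        for m, k, t, a in zip(state, channels, service_times, transfer):
            w *= (a * t) ** m * r_factor(m, k)
        weights[state] = w
    norm = sum(weights.values())
    return {s: w / norm for s, w in weights.items()}


def marginal(probs, node, total):
    """Pr(M_node = r) for r = 0..total, summed over all other nodes."""
    pmf = [0.0] * (total + 1)
    for state, p in probs.items():
        pmf[state[node]] += p
    return pmf


if __name__ == "__main__":
    # Hypothetical network: S_0 with 3 channels plus two single-channel QS.
    probs = state_probabilities(3, [3, 1, 1], [2.0, 0.5, 1.0], [1.0, 1.0, 0.5])
    print(marginal(probs, 1, 3))
```

Enumeration is only practical for small networks; it is used here because it keeps the link between the states and their probabilities visible.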

Find the expression for the loads $\rho_j$ of the systems $S_j$. For a single-channel QS, the load is the difference
between 1 and the probability that this QS is idle. The probability that the system $S_j$ has exactly r requests
while the requests of the other systems are distributed in any possible combination is written as

$$\Pr(M_j = r) = \sum_{M_j = r} \Pr(M_1, \ldots, M_n).$$

Then the probability that a single-channel system does not have any current requests is
$\Pr(M_j = 0) = \sum_{M_j = 0} \Pr(M_1, \ldots, M_n)$. Therefore,

$$\rho_j = 1 - \Pr(M_j = 0) = 1 - \sum_{M_j = 0} \Pr(M_1, \ldots, M_n) \qquad (2)$$

For a multichannel system, the mean number of idle channels is

$$K_j - k_j = \sum_{r=0}^{K_j - 1} (K_j - r) \Pr(M_j = r) \qquad (3)$$

where $K_j$ is the number of channels in the system; $k_j$ is the mean number of busy channels; $\Pr(M_j = r)$
is the total probability of all states from the set $A(M, n)$ for which $M_j = r$. The load of each channel in
a multichannel system $S_j$ is defined as the difference between 1 and the share of idle channels in
the total number of channels:

$$\rho_j = 1 - \frac{1}{K_j} \sum_{r=0}^{K_j - 1} (K_j - r) \Pr(M_j = r) \qquad (4)$$

From (3), find the mean number of busy channels in the system $S_j$:

$$k_j = K_j - \sum_{r=0}^{K_j - 1} (K_j - r) \Pr(M_j = r) \qquad (5)$$

Evidently, $\rho_j = k_j / K_j$. Given that for a multichannel QS $k_j = \lambda_j \vartheta_j$, where $\vartheta_j$ is the mean
channel-specific request processing time, obtain the expression for the incoming stream rate:

$$\lambda_j = k_j / \vartheta_j \qquad (6)$$

To calculate the mean number of requests $m_j$ in the system $S_j$ and the mean number of requests $l_j$
waiting in its queue, use the expressions

$$m_j = \sum_{r=0}^{M} r \Pr(M_j = r) \qquad (7)$$

$$l_j = \sum_{r=K_j+1}^{M} (r - K_j) \Pr(M_j = r) \qquad (8)$$
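
Expressions (4), (5), (7) and (8) all operate on the same marginal distribution of the number of requests in one system, so they can be evaluated together. The following is a minimal sketch under that reading; the function name, the dictionary keys and the sample distribution in the usage note are assumptions made for illustration.

```python
def node_metrics(pmf, channels):
    """Evaluate (4), (5), (7), (8) for one queueing system.

    pmf      -- pmf[r] is the probability of exactly r requests, r = 0..M
    channels -- integer number of channels of the system
    """
    top = len(pmf)  # equals M + 1
    # Mean number of idle channels, the right-hand side of (3).
    idle = sum((channels - r) * pmf[r] for r in range(min(channels, top)))
    busy = channels - idle                                  # (5)
    load = busy / channels                                  # (4)
    in_system = sum(r * p for r, p in enumerate(pmf))       # (7)
    waiting = sum((r - channels) * pmf[r]
                  for r in range(channels + 1, top))        # (8)
    return {"busy": busy, "load": load,
            "in_system": in_system, "waiting": waiting}
```

For a single-channel system with the hypothetical distribution `[0.5, 0.3, 0.2]`, the mean number of requests in the system equals the mean number of busy channels plus the mean queue length, which is a convenient sanity check on any implementation of these sums.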


The mean stay time $u_j$ and the mean waiting time $\omega_j$ in the systems $S_j$ $(j = 1, \ldots, n)$ equal
$u_j = m_j / \lambda_j$ and $\omega_j = l_j / \lambda_j$, respectively, where $\lambda_j$, $m_j$, and $l_j$ are found from (6) to (8).
Refer to the mean time interval between two consecutive exits of a request from the system $S_i$ as the system cycle
time. The mean stay time of a request in the system $S_j$ over its single stay in the system $S_i$ equals
$(\lambda_j / \lambda_i) u_j$. The mean cycle time $U_i$ can be found by summing these values for all the systems in
the network: $U_i = \sum_{j=1}^{n} (\lambda_j / \lambda_i) u_j$. Given that the mean stay time of a request in the system $S_j$
is $u_j = m_j / \lambda_j$, get

$$U_i = \sum_{j=1}^{n} \frac{\lambda_j}{\lambda_i} \cdot \frac{m_j}{\lambda_j} = \frac{1}{\lambda_i} \sum_{j=1}^{n} m_j \qquad (9)$$
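
The time characteristics follow from the per-system means by simple division, so a small helper is enough to compute them. The sketch below assumes that the rates and the mean numbers of requests are already known for every system passed in; the function name, the `ref` argument selecting the reference system, and the numbers in the usage note are illustrative assumptions.

```python
def time_metrics(rates, in_system, waiting, ref=0):
    """Mean stay time, mean waiting time and cycle time.

    rates     -- incoming stream rate of every system
    in_system -- mean number of requests in every system
    waiting   -- mean number of waiting requests in every system
    ref       -- index of the system whose cycle time is computed
    """
    stay = [m / lam for m, lam in zip(in_system, rates)]
    wait = [l / lam for l, lam in zip(waiting, rates)]
    # Each system contributes its stay time weighted by the rate ratio.
    cycle = sum(lam / rates[ref] * u for lam, u in zip(rates, stay))
    return stay, wait, cycle
```

With hypothetical rates `[2.0, 1.0]` and mean numbers of requests `[1.0, 0.5]`, the cycle time for the first system equals the total mean number of requests divided by its rate, which matches the simplification in (9).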

Given that we are expected to model virtualized CS, it is safe to say that the obtained models will
depend on, and self-adapt to, the load according to rules that make use of the data received while
computing a model. It is also worth noting that the mean queue length and the performance depend on
the QS service rate. As the service rate grows, the mean queue length decreases monotonically and
the performance increases monotonically; as the service rate drops, the opposite holds.


3. INPUT DATA FOR MODELS 

The initial number of QS channels is set by the assumed number of dedicated or shared virtual
devices for each specific virtual server. The concept of QS channel number is replaced with the concept
of allocated processing power, given as a percentage or proportion of a whole processing channel/CPU.
For instance, two channels can be assigned to a CPU, which means the virtual server has two virtual CPUs;
if the model is assigned two CPU QS, it means that the virtual server has two CPUs, each of which can be
further divided into virtual CPUs as channels. On the other hand, a CPU QS can be allocated specific processing
power starting from 0.1 in increments of 0.01, which mirrors the corresponding virtualization property.
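
The allocation rule above (a minimum of 0.1 and a step of 0.01) can be enforced with a small helper when preparing model inputs. This is a minimal sketch; the function name and the choice to round a request down to the grid are assumptions, and decimal arithmetic is used only to avoid binary floating-point surprises on the 0.01 grid.

```python
from decimal import Decimal

MIN_UNITS = Decimal("0.10")  # smallest processing power that can be allocated
STEP = Decimal("0.01")       # allocation increment


def normalize_units(value):
    """Round a requested processing power down to the 0.01 grid.

    Raises ValueError if the result is below the 0.1 minimum.
    """
    units = (Decimal(str(value)) // STEP) * STEP
    if units < MIN_UNITS:
        raise ValueError("processing power must be at least 0.1")
    return units
```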

The remaining devices are set in the model in a similar way, e.g. when an input-output device is not
dedicated to a particular virtual server, or when a virtual server uses a shared device provided by the virtual I/O
server. Such a device could be a network adapter or an I/O adapter for accessing ES of different types. Before models
can be built for studies, consider the structural diagram of a host platform used for creating virtual servers,
shown in Figure 2. It shows all the components of future virtual servers; unused resources are pooled together
to form the CPU pool, the RAM pool, the ES pool, etc.; the diagram also shows the primary component
of resource virtualization, a POWER hypervisor, and zoomed-in diagrams of model virtual servers.

Figure 2. Structural diagram of a host platform


Figure 3 shows a part of Figure 2 (virtual servers) in detail; it also demonstrates virtual 
input-output servers with CP distribution and logical links between the system components. The diagram 
of a non-virtualized MPS would be functionally similar except that it would use a fixed amount of hardware 
resources instead of pools. Reconfiguring a server with a fixed amount of hardware resources will at least
require a server shutdown; besides, it might require reconfiguring the runtime environment, the OS, or the app
server. When using virtualization, resources can be added to or removed from the configuration while
the system is running. To model such operating conditions using the selected model computing method, the
method can be adjusted or used in two ways:

a. Either use multichannel QS where the number of channels can be adjusted during simulation. This might
be inappropriate for percentage distribution of CPU power, when the number of channels $K_j$
and the number of busy channels $k_j$ in a multichannel QS could be fractional numbers

b. Or simulate the process by adding or removing QS dynamically, which will entail a full rebuild
of the transmission probability matrix and recalculation of all of the earlier collected statistics, which
is not acceptable

Figure 3. Structural diagram of logical partitions


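The first option is preferable; however, some of the formulas have to be modified,
e.g. the formula for finding the mean number of busy channels in the QS, as in this case, the CPU power can
be distributed starting at 0.1 of the total CPU power:

$$k_j = \begin{cases} \rho_j & \text{for } K_j = 1 \\ \rho_j K_j & \text{for } K_j > 1 \end{cases} \qquad (10)$$

where the load of the multichannel system is defined as $\rho_j = k_j / K_j = \lambda_j \vartheta_j / K_j$, with $\vartheta_j$
the mean channel-specific request processing time. Another formula uses the factorial, which is generally
only applicable to integers:

$$R_j(M_j) = \begin{cases} \dfrac{1}{M_j!} & \text{for } M_j \le K_j \\[2ex] \dfrac{1}{K_j!\, K_j^{M_j - K_j}} & \text{for } M_j > K_j \end{cases} \qquad (11)$$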


The problem of calculating the factorial of a fractional number can be solved by using
the asymptotic factorial formula (the Stirling formula), which gives the approximate factorial (12).

$$n! = \sqrt{2\pi n}\left(\frac{n}{e}\right)^{n}\left(1 + \frac{1}{12n} + \frac{1}{288n^{2}} - \frac{139}{51840n^{3}} + O\left(n^{-4}\right)\right) \qquad (12)$$

where big O is the mathematical notation for comparing the asymptotic behavior of functions, i.e.
the way a function behaves when its argument approaches a certain point. The exact meaning of big O depends on
the application; however, $O(f)$ never grows faster than $f$. In many cases, approximate calculation of a
factorial requires only the dominant term of the Stirling formula

$$n! \approx \sqrt{2\pi n}\left(\frac{n}{e}\right)^{n} \qquad (13)$$

It can be argued that

$$\sqrt{2\pi n}\left(\frac{n}{e}\right)^{n} < n! < \sqrt{2\pi n}\left(\frac{n}{e}\right)^{n} e^{1/(12n)} \qquad (14)$$
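
How well the Stirling formula works for fractional arguments can be checked numerically. The sketch below compares the dominant term (13), the expansion (12) without its remainder term, and the bounds (14) against the gamma function; treating Gamma(x + 1) as the reference value of a "fractional factorial" is an assumption of this sketch, and the function names are illustrative.

```python
import math


def stirling_leading(x):
    """Dominant term of the Stirling formula, as in (13)."""
    return math.sqrt(2.0 * math.pi * x) * (x / math.e) ** x


def stirling_series(x):
    """Stirling formula with the correction terms written out in (12)."""
    correction = (1.0 + 1.0 / (12.0 * x) + 1.0 / (288.0 * x ** 2)
                  - 139.0 / (51840.0 * x ** 3))
    return stirling_leading(x) * correction


def stirling_bounds(x):
    """Lower and upper bounds from (14)."""
    lower = stirling_leading(x)
    return lower, lower * math.exp(1.0 / (12.0 * x))


if __name__ == "__main__":
    for x in (0.5, 1.5, 2.5, 10.0):
        # Assumption: Gamma(x + 1) is the reference for the factorial of x.
        exact = math.gamma(x + 1.0)
        print(x, exact, stirling_leading(x), stirling_series(x))
```

The dominant term alone underestimates the reference value noticeably for small arguments, so for allocations near the 0.1 minimum the correction terms (or the gamma function itself) may be the safer choice.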


One of the absolute musts is the ability to create random events so as to simulate random failures in
the CS model. The probability of a failure is set as a parameter of a pseudo-random failure trigger that simulates
the failure of a particular device in the CS model. The failure trigger generates events during the model runtime.
While the model is running, such events manifest as a drop in the channel-specific power of a QS to
the minimum of 0.1. Full exclusion of the failing QS from the model is not an option, as that would entail
resizing the entire transmission matrix and could corrupt the statistics collected before the failure.
Post-failure restoration of the processing power is possible provided there is a pool of available
virtual resources. Network channel aggregation and input-output channel backup can
be simulated by several intermediate QS. Based on the collected model computation results, one can demonstrate
the efficiency of using the CR and the better balance of the simulated virtualized CS as compared to ordinary
CS that cannot dynamically allocate resources. Given the above-described triggers, the models
become adaptive; in other words, one can create adaptive models, specifically mathematical models used in
combination with operator-assigned dynamic characteristics, i.e. machine decision-making procedures
applicable to adjustments in the model resources.
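
The trigger mechanism can be sketched as two small rules acting on a queueing system: one drops its power on a pseudo-random failure, the other restores it from a pool. This is a minimal illustration of the idea only; the class names, the `nominal` field and the pool interface are assumptions and do not reproduce the authors' implementation.

```python
import random
from dataclasses import dataclass

MIN_POWER = 0.1  # floor from the text: a failed channel keeps 0.1 of its power


@dataclass
class QueueingSystem:
    name: str
    power: float    # current channel-specific processing power
    nominal: float  # power assigned before any failure


class ResourcePool:
    """Pool of free virtual processing power (hypothetical interface)."""

    def __init__(self, free_power):
        self.free_power = free_power

    def take(self, amount):
        granted = min(amount, self.free_power)
        self.free_power -= granted
        return granted


def failure_trigger(qs, probability, rng):
    """Drop qs.power to MIN_POWER with the given probability."""
    if rng.random() < probability:
        qs.power = MIN_POWER
        return True
    return False


def restore_trigger(qs, pool):
    """Restore power from the pool, as far as the pool allows."""
    missing = qs.nominal - qs.power
    if missing > 0:
        qs.power += pool.take(missing)
    return qs.power
```

Keeping the failed system in the model at reduced power, instead of deleting it, is what lets the matrix dimensions and the statistics collected so far stay valid.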


4. RESULTS 
4.1. Model verification 

In general, verification means confirming that the CS model description fully matches
the specification or the analyzed system. To check whether the experimental system works as planned,
it is necessary to trace the system response to an input and compare it with the simulated response or with
the response of another model [26]. Model verification is a very important process that can be done
in several ways:
a. Experimentally
b. By using simulation models when testing analytical models, or vice versa
c. By using a third method to build a similar model and compare the results

Bearing in mind that papers [12-14] describe virtualized-CS models, the most rational way is to create
virtual servers and measure their dynamic characteristics; in other words, to carry out a computational
experiment. Verification tests have been run using a virtual server, see Table 1 for specifications.


Table 1. Specifications of the virtual server used for model verification

| Resource name | Quantity and characteristics |
|---|---|
| Processor | Two Power 5 virtual processors, 1.6 GHz (up to 1 physical processor allocated); also a double dedicated processor |
| Cache | L1 combined cache: 64 KB instructions and 32 KB data; 1.9 MB L2; 36 MB L3 |
| Memory | 2 GB (range: 1 to 4 GB) |
| Drive | 20 GB, connected via a virtual I/O server, virtual SCSI adapter |
| Network | 100 Mbps or 1 Gbps, also a virtual Ethernet adapter, via a virtual I/O server |
| OS | AIX 6.1, 64-bit |


Using queuing theory to describe adaptive mathematical models of computing... (Alexey I. Martyshkin) 


1114 O ISSN: 2302-9285 


Each virtual processor has SMT on, i.e. comprises two logical processors. Figure 4 shows an image 
from the hardware management console (HMC) connected to the physical server (System P) hosting 
the virtual server. 





[HMC screenshot, General tab of the partition properties. Name: lpar_EA; ID: 3; Environment: AIX or Linux; Status: Running; Attention indicator: Off; Resource configuration: Configured; OS version: AIX 6.1 6100-01-01-0323; Current profile: lpar_EA_profile; System: 9133-55A*06456BH]

Figure 4. General partition properties


To measure the functional indicators of the virtual server, this research uses the Nmon utility [27],
which collects and logs statistics on the virtual server operations. The data can then be visualized as graphs
or shown in the console window in a symbolic form. Nmonanalyser is used to convert these statistics into
a convenient representation [28]. Stress is used to generate server load, i.e. to simulate request processing in
a way similar to the models [29]. This C utility was extended to generate loads not by time but by a set number
of simple cycles (a counter decrement and the sqrt() function are computed instead of waiting for a timeout). This effectively
simulates running “requests” of a specific computational intensity, so that model calculation results can be linked to
verification results and conclusions on the performance can be drawn. Besides, this program uses child processes,
i.e. makes effective use of parallel processing.
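
The idea of a fixed-work "request" can be illustrated without the modified Stress utility itself. The sketch below is a Python stand-in, not the authors' C code: each request decrements a counter and takes a square root on every pass, and a batch is spread across child processes. The function names and parameters are assumptions made for illustration.

```python
import math
import time
from multiprocessing import Pool


def run_request(cycles):
    """Do a fixed amount of work: count down, taking a square root each pass."""
    acc = 0.0
    counter = cycles
    while counter > 0:
        acc += math.sqrt(counter)
        counter -= 1
    return acc


def run_batch(requests, cycles, workers):
    """Process `requests` requests on `workers` child processes.

    Returns the elapsed wall-clock time in seconds.
    """
    start = time.perf_counter()
    with Pool(processes=workers) as pool:
        pool.map(run_request, [cycles] * requests)
    return time.perf_counter() - start
```

Calling `run_batch(requests=4, cycles=100_000, workers=2)` from a script's main guard gives a processing time for a batch of known size. Because the work per request is fixed, the measured time reflects the processing power available rather than a preset timeout.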

The experiment is a two-part test: it monitors a virtual server with allocated processing power
and a virtual server with a dedicated processor, using the load generation utility and dynamic reconfiguration
to adjust to the load. While verifying a model, it should be borne in mind that any operating system (OS) runs
various system processes that load the CPU(s) at 5% to 10% on average; this is comparable to
the modeling error and constitutes the verification error, which is acceptable for engineering studies.
This conclusion is confirmed by the Nmon-collected statistics on the idle load, when the virtual server
is only running the OS itself, see Figure 5.


Experiment steps:

a. Start Nmon to collect and log statistics every second until Stress completes a run

b. Start Stress at the required computational intensity (number of requests)

c. Collect statistics on the load of the virtual server or its separate subsystems while running in a static mode,
i.e. without reconfiguring

d. Collect statistics on the load of the virtual server or its separate subsystems while running in
a dynamic mode

e. Use Nmonanalyser to process the statistics

f. Approximate the results and compare them to the modeled ones

Figure 5. Idle system load (series: User%, Sys%, Wait%)


Assume that a request equals 10,000,000 cycles of an ordinary counter. First collect data on
the static request processing time: the virtual server is subject to no reconfiguration, and only the number
of processed requests changes, see Figure 6. Stress is configured as follows:
cpu 2: two threads intensively load the CPU by calculating sqrt() for a random number
io 2: two threads intensively load the I/O system, namely the buffers
hdd 1 --hdd-bytes 256M: one thread intensively writes to the disk in 256-MB blocks
vm 2 --vm-bytes 32M: two threads intensively use the RAM in 32-MB blocks
loops: the number of counter cycles or the number of requests, from 0.5 to 16
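
The settings listed above map onto options of the stock Stress utility, so the command line can be assembled programmatically. The sketch below only builds the argument list and does not run anything; the function name is an assumption, and the loop-count setting is left out because it belongs to the authors' modified build and its command-line spelling is not given in the text.

```python
def stress_command(cpu=2, io=2, hdd=1, hdd_bytes="256M", vm=2, vm_bytes="32M"):
    """Build an argument list for stock `stress` from the listed settings.

    The authors' loop-count extension is deliberately omitted: it is not
    part of stock stress and its option name is unknown here.
    """
    return [
        "stress",
        "--cpu", str(cpu),
        "--io", str(io),
        "--hdd", str(hdd), "--hdd-bytes", hdd_bytes,
        "--vm", str(vm), "--vm-bytes", vm_bytes,
    ]
```

The resulting list can be passed to `subprocess.run` on a host where Stress is installed.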


Figure 6. Request processing time (x-axis: number of requests; y-axis: request processing time; series: 0.5 processing units (1 virtual processor = 2 logical processors, 2 GB memory); 1.0 processing unit (2 virtual processors = 4 logical processors, 2 GB memory); 2.0 dedicated processing units (2 virtual processors = 4 logical processors, 2 GB memory))


Data has been collected from 3 virtual server configurations that only differ in allocated processing
power, RAM = 2 GB:

a. 0.5 processor units (1 virtual processor = 2 logical processors)

b. 1 processor unit (2 virtual processors = 4 logical processors)

c. 2 dedicated processor units (2 virtual processors = 4 logical processors)

Repeated measurements show a reduction in the processing time as more iterations are run, which
is explained by greater amounts of data being stored in the cache (up to 100%). Each CPU-loading thread
is processed by a separate virtual processor, which demonstrates the efficiency of multithreading in
software. For example, if there are only two threads, two of four virtual processors will be idle; in this case,
virtualization enables flexible adaptation to the load. A similar feature was demonstrated in
the models in [12-14].

Maximum request processing speed is attained when the number of virtual processors equals
the number of threads in a program. The mean request processing rates of the three configurations equal 0.053,
0.141, and 0.483 requests/sec. Comparison of the results shows that the model behaves similarly to the real
system, see Figure 7. Below are the results of virtual server monitoring as collected by Nmon
and Nmonanalyser with PLM (partition load manager) enabled and a specified resource management policy.
The batch job contains 20 requests. The batch is processed twice. The graphs show the “stepped” dynamic
resource buildup, particularly in the case of CPU (Figure 8) and RAM (Figure 9).


Figure 7. Summary virtual server statistics (panel title: Total for the virtual server; series: physical CPU, I/O operations per second)

Figure 8. Dynamics of logical CPUs (panel title: Dynamics of logical CPUs, lpar_EA; y-axis: allocated processing power)

Figure 9. Dynamics of virtual CPUs (panel title: Virtual CPU dynamics; y-axis: number of virtual CPUs)

The CPU utilization (load) depends on how well a running program is parallelized in comparison
to Stress. Approximating the graph above makes it clear that the CPUs are loaded at 100% right from receiving the
first request; request multithreading maximizes the CPU utilization. Figure 10 shows the dynamics
of load across all CPUs.

Figure 10. Dynamics of load across all CPUs (panel title: CPU-total loading dynamics, lpar_EA; series: User%, Sys%, Wait%)


Starting at 18:15, the graphs show the processing of the second request batch; it takes far
less time to process since the maximum processing power is available right away, unlike in the case
of the first batch, during which the server was still adapting itself to the load. Efficient caching enhances
the performance, too. As can be seen for CPU7 and CPU8, increasing the number of processors
beyond the number of test threads causes the “unneeded” CPUs to idle.

Thus, verification using a virtual server configured similarly to the model reveals similar dynamics of
the server and the model, both in terms of the load and in terms of resource utilization; it also makes clear
the effect of adaptability, which means that the models developed and described in [26-29] are appropriate for
studying virtualized CS. However, there are some differences, as the input stream of the model
and the real-world task batch are different.


4.2. Adaptive model building and evaluation 
As mentioned in [5], a parallel algorithm for P processors will, compared to sequential
computing, solve the problem P times faster than on a single CPU and/or multiply
the amount of processed data by P; however, such acceleration is rarely attainable, as most executables
are not optimized and feature a considerable portion of non-parallel code. Given that most state-of-the-art
heavy-load software systems use parallelization, hardware utilization efficiency can be maximized by coupling
parallel computing with virtualization for dynamic allocation of processing power and RAM.
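
The gap between the ideal P-fold acceleration and what non-parallel code allows is commonly estimated with Amdahl's law. The text does not name this law, so the sketch below is a standard formulation brought in for illustration; the function name and the sample fractions in the usage note are assumptions.

```python
def amdahl_speedup(parallel_fraction, processors):
    """Upper bound on speedup when only part of the work is parallelizable.

    parallel_fraction -- share of the run time that can be parallelized (0..1)
    processors        -- number of processors P (at least 1)
    """
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if processors < 1:
        raise ValueError("processors must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / processors)
```

For a fully parallel program, `amdahl_speedup(1.0, 8)` returns the ideal factor of 8, while a program that is 90% parallel stays below a factor of 10 no matter how many processors are added. This is one way to see why adding processing power beyond what the code can use leaves CPUs idle.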

Judging from the above, building an efficient virtualized-CS model is key; such virtualization shall 
best suit the needs of an app planned to run on the future virtual server based on the model. On the other hand, 
given that virtualized CS adapt well to loads, the easiest approach would be using a minimum configuration 
and allow the CS to optimize its configuration while processing a task batch so as to adapt itself to the load. 
It is also possible to run each app on a separate virtual server with a separate OS, which will isolate 
the processes in terms of security and fail-safety. Let us define the basic criteria of modeling efficient 
virtualized CS: 

a. Adequate source data for the model, adjusted to the parameters of the future hardware host platform, as 
the accuracy of the inputs will directly affect the accuracy of simulation 

b. Flexibility of model adaptability to load, which is attained by using multiple criteria and conditions 
of adaptability triggering 

c. Cost optimization to the researcher’s requirements as part of the modeling effort 

d. Request processing speed and quality optimization to the researcher’s requirements as part of the modeling 
effort. Such optimization shall provide minimum request processing time, minimum set of resources, etc. 

e. It is assumed that workload tasks are optimized for multithreading 
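
The idea of starting from a minimum configuration and letting triggers adapt it to the load can be sketched as follows. This is a minimal illustration, not the triggers used in the models of [26-29]: the class name, the utilization thresholds, and the CPU limits are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Trigger:
    """Hypothetical adaptability trigger; all thresholds are illustrative."""
    scale_up_utilization: float = 0.85    # add a CPU above this utilization
    scale_down_utilization: float = 0.30  # release a CPU below this utilization
    min_cpus: int = 1
    max_cpus: int = 8

    def adjust(self, cpus: int, utilization: float) -> int:
        """Return the CPU count to use in the next monitoring interval."""
        if utilization > self.scale_up_utilization and cpus < self.max_cpus:
            return cpus + 1
        if utilization < self.scale_down_utilization and cpus > self.min_cpus:
            return cpus - 1
        return cpus


def simulate(trigger: Trigger, offered_load: list, start_cpus: int) -> list:
    """Replay a load profile and record (cpus, utilization) per interval.

    offered_load holds the CPU demand of each interval, expressed in units
    of one fully busy CPU (an assumption made for this sketch).
    """
    cpus = start_cpus
    history = []
    for demand in offered_load:
        utilization = min(1.0, demand / cpus)
        history.append((cpus, utilization))
        cpus = trigger.adjust(cpus, utilization)
    return history


if __name__ == "__main__":
    # Start from the minimum configuration under a steady demand of 4 CPUs.
    for cpus, utilization in simulate(Trigger(), [4.0] * 6, start_cpus=1):
        print(cpus, round(utilization, 2))
```

With a steady demand of four CPUs, this sketch grows the configuration by one CPU per interval until utilization falls below the upper threshold. Several such triggers with different conditions could be combined to cover criterion b.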

Powerful and efficient CS and visualization tools will considerably reduce the task processing, 
analysis, and prediction time applicable to electronic workflow, real-time transaction processing, and creation 
of data storages for decision-making systems, climate and global warming modeling, etc. However, a fast CPU 
is not everything. The architecture must be balanced to fully utilize the power of a modern CPU. Efficient 
computing platforms shall provide balanced performance in many aspects, including memory access, system 
switch, input-output, graphics accelerator, network operations, and CPU computing. 

As the performance and scalability bar is set ever higher, conventional workstations 
(even multiprocessor ones) become too expensive and impractical, making it more effective to use virtual 
servers with natural virtualization: OS virtualization tools consume 10% to 30% of the CPU power, while 
natural virtualization is provided at the firmware level or by specialized hardware, which is far more efficient. 

Multiprocessor computing can be made more efficient by parallelization, which accelerates database 
query processing, provides efficient access to remote file systems, and speeds up resource-intensive applications. 
Indeed, the POWER architecture provides such flexibility that additional virtual CPUs can simply be added 
if necessary, or pre-installed ones can be activated to handle peak loads. Besides, the OS and the related 
software and technologies do support and make active use of the hardware advantages this platform offers. 

Thus, for the above criteria, the method of modeling an adaptive system will comprise the following 
steps: 1) build a model in a software system (e.g., m [30]); 2) calculate and analyze the results; 3) define 
the significant cost, quality, and processing speed criteria; 4) vary the model parameters to optimize by 
the previously defined criteria; 5) compare to the initial version and make the necessary adjustments; 
6) create a virtual server to verify the model, as this method produces the most reliable data; 7) define 
the virtual server resource management policy specifying the pool of available resources or the donor group of 
virtual servers; 8) configure and start the server tasks; 9) monitor; 10) analyze the resource manager functioning 
and adjust the original virtual server profile to finalize the virtual server configuration. 
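
Steps 2 to 5 of this method can be illustrated with a small closed queueing network solved by exact Mean Value Analysis (MVA). This is only a sketch under simplifying assumptions: a single request class, fixed-rate stations, and a CPU whose "power" simply divides its service demand. The station set, the cost figures, and all function names are hypothetical and do not reproduce the models of the paper.

```python
def mva(demands, think_time, jobs):
    """Exact single-class MVA for a closed network of fixed-rate stations.

    demands:    service demand per request at each queueing station (seconds)
    think_time: delay between a response and the next request (seconds)
    jobs:       number of requests circulating in the closed network
    Returns (throughput, response_time) at the given population.
    """
    queue = [0.0] * len(demands)  # mean queue lengths at population n - 1
    throughput, response = 0.0, 0.0
    for n in range(1, jobs + 1):
        # An arriving request sees the queue left by n - 1 other requests.
        residence = [d * (1.0 + q) for d, q in zip(demands, queue)]
        response = sum(residence)
        throughput = n / (think_time + response)
        queue = [throughput * r for r in residence]
    return throughput, response


def cheapest_configuration(cpu_demand, disk_demand, think_time, jobs,
                           target_response, options):
    """Pick the lowest-cost option that meets the response-time target.

    options: iterable of (power, cost); power scales the CPU channel speed.
    Returns (cost, power, response_time) or None if no option is feasible.
    """
    feasible = []
    for power, cost in options:
        _, response = mva([cpu_demand / power, disk_demand], think_time, jobs)
        if response <= target_response:
            feasible.append((cost, power, response))
    return min(feasible) if feasible else None


if __name__ == "__main__":
    # Illustrative inputs only: (CPU power multiplier, relative cost).
    options = [(1.0, 100), (2.0, 180), (4.0, 400)]
    print(cheapest_configuration(0.4, 0.1, 1.0, 10, 1.0, options))
```

The selected configuration would then serve as the initial profile of the virtual server created in step 6, to be refined from monitoring data.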


5. CONCLUSION 

This research has experimentally produced simple models for analyzing various systems designed for 
tasks of varying responsibility and requiring various resource groups to efficiently handle whatever they 
are tasked with. The proposed models can analyze various CS options that use virtualized resources with 
the above constraints. Speaking of the real-world application of virtual servers, outsourcing IT infrastructures 
is an increasingly popular solution, as it eliminates the need to purchase expensive servers that will also require 
hardware and software support. 

Therefore, there exist two separate products for resource and load management, which causes 
inconvenience and makes it difficult to configure an integrated system that would take into account both 
the resource load and the responsibility of each application. It is therefore optimal and convenient to use 
a single integrated resource management mechanism based on the resource loads and on the responsibility 
of tasks assigned to each partition. The researchers have verified the mathematical virtualized-CS models by 
using a similarly configured virtual server. 

Comparing the verification results and the calculations shows that the model and its virtual server 
implementation are identical in dynamics. The differences in the calculations and the experimental results 
are due to the difficulty of simulating the model-generated input stream of requests, which is quite abstract 
compared to real-world tasks; for maximum similarity, the research team has used a multithreaded load 
generator that clearly shows the specifics of multiprocessor CS with respect to the thread distribution between 
processors and thread parallelization. 

CS cost and service quality optimizations are case-specific; however, one general conclusion 
is that maximum performance is not attainable within cost restrictions. The developed method for building 
virtualized-resource CS models and optimizing them by various criteria reveals an interesting effect: 
it is possible to build systems capable of self-adaptation to load while being autonomous, which reduces 
the maintenance costs. The effect is observed when flexible system resource management policies 
are configured. In conclusion, it should be noted that the considered adaptive systems use both 
model-generated and expert data for decision-making; the expert data are idiosyncratic decision charts provided 
by the researcher. 